Abstract

Many interesting application systems, ranging from workflow
management and CSCW to air traffic control, are event-driven
and time-dependent and must interact with heterogeneous
components in the real world. Event services are used
to glue together distributed components. They assume a virtual
global time base to trigger actions and to order events.
The notion of a global time that is provided by synchronized
local clocks in distributed systems has a fundamental impact
on the semantics of event-driven systems, especially the
composition of events. The well studied 2g-precedence model,
which assumes that the granularity g of the global time base can
be derived from an a priori known and bounded precision of
local clocks, may not be suitable for the Internet, where the
accuracy and external synchronization of local clocks is best
effort and cannot be guaranteed because of large transmission
delay variations and phases of disconnection. In this
paper we introduce a mechanism based on NTP synchronized
local clocks with global reference time injected by GPS time
servers. We argue that timestamps of events can be related to
global reference time with bounded accuracy and propose
that event timestamps are modeled using accuracy intervals.
We present algorithms for event composition and event
consumption which make use of accuracy interval based
timestamping and illustrate the problems that arise due to
inaccuracy and message transmission delays.



I. Introduction 

Event-based computing is an emerging paradigm for 
composing applications in open, heterogeneous distributed 
environments [4,23,20,13]. Applications like workflow
management [7,19,14], CSCW [5] and monitoring applications
ranging from Air Traffic Control [3,29] to Health Care
Systems [12] may be constructed by leveraging event services for
detection and distribution of events in a publish/subscribe 
manner. The use of generic event services requires that the 
semantics of event services that is presented to the application 
developer be not only formally specified [45,49] but also
unambiguous. Failing to do so may cause mission-critical 

1 Also ISISTAN, Faculty of Sciences, UNICEN, Tandil, Argentina.



applications to malfunction or behave nondeterministically, and
may result in unreliable software and impose unacceptable 
risks. 

The use of absolute and relative temporal events to trig- 
ger actions, the need to measure duration of activities, and the 
detection and composition of events that may originate in dis- 
tributed components that are loosely coupled render distrib- 
uted event-driven systems time-dependent. A well defined 
event service depends on three basic factors: the proper inter- 
pretation of time, the adoption of partial order of events and 
the consideration of transmission delays between producers 
and consumers of events. In order to describe and detect com- 
plex situations, advanced event services provide the notion of 
composite events. Typically we are interested in causal 
dependencies between real-world happenings or computa- 
tions. Temporal order is a prerequisite for causal order. There- 
fore, potential causality can be detected - or excluded - when 
examining the order of event occurrences. However, occurrence
time and global order of events can only be determined
by an omniscient external observer. In practice, detection and 
timestamping of events is delayed from the instant of occur- 
rence. Additionally, time as provided by a distributed time 
service is imprecise with respect to clock readings at different 
nodes and inaccurate with respect to physical time. As a
consequence, timestamps are inherently inaccurate and may
distort the real order of occurrence of events. The inability to
provide precise and accurate timestamps has additional 
impact on event consumption, i.e. the selection of events that 
are to be composed. Consumption policies like recent and 
chronicle rely upon the temporal order of events when select- 
ing the latest events (recent) or the oldest events (chronicle) 
out of the event stream. Furthermore, event consumption must 
contemplate variable transmission delays, especially in the 
case of multiple, independent remote publishers. 

In this paper we focus on timestamping and composition 
of events in large-scale, loosely coupled, distributed systems
without centralized management, like the Internet. Unpredict- 
able bounds and large variations on message transmission 
delays, possible phases of disconnection and independent 
failure modes are characteristic for such an environment and 
complicate the realization of a general purpose event service. 
In particular, it is not possible to determine a priori the
precision bounds for all local clocks in the system. Therefore, we



0-7695-0384-5/99 $10.00 © 1999 IEEE 






argue that ordering of events based on a sparse time base or
the 2g-precedence model does not scale up to the Internet. 
In our solution we make use of the Network Time Protocol 
(NTP). 

The remainder of this paper is organized as follows. 
Next, an overview of related work is presented. Section III
introduces the concept of global time based upon synchro- 
nized local clocks. We give a brief overview on NTP time 
services and then present a mechanism for timestamping 
events based upon accuracy intervals. We introduce an 
accuracy interval order that is the basis for event composition
and consumption. Section IV briefly describes the
architecture of our event service. After that we discuss the 
implementation of simple event composition operators and 
point out the potential pitfalls due to the very nature of dis- 
tributed systems. Finally we address open issues and 
present current and future work. 

II. Related Work 

General-purpose event notification services have been 
proposed recently as part of major middleware initiatives 
[37,38,39,20,31]. However, most of them are restricted to 
primitive events and do not consider any consumption poli- 
cies. 

Composition of events was proposed together with the 
concept of Event-Condition-Action rules in active databases
[10]. Active databases support composite events but
assume the existence of a totally ordered event history, and
therefore are restricted to centralized systems. Active
databases handle database events, temporal events, and
user-defined events. HiPAC [11] considered ECA rules in
general, and provided basic mechanisms for composite event
specification. Compose [18] introduced powerful event 
operators. Snoop [8] introduced a formal definition of prim- 
itive and composite events based on a global history log, 
and four event consumption policies: recent, chronicle, con- 
tinuous and cumulative. Reach [6] provided mechanisms 
for efficient detection and composition based on the 
SAMOS [16] algebra. Ode [22] proposed complex event 
composition but used timestamps for event identification 
and required a total ordering. Recent efforts have concen- 
trated on unbundling database functionality to provide, 
among others, active functionality services through config- 
urable components [17,25]. None of the previously men- 
tioned approaches has addressed properly the problems of 
global time, imprecise timestamps of events, and composi- 
tion delays. Instead, they all assume a total ordering of 
events. 

In [27], Lamport presented the happened before rela- 
tion, which defines a partial ordering of events based on the 
causality principle. An event a happened before an event b 
(denoted $a \rightarrow b$) if a could have influenced b; a and b are
said to be causally dependent. If neither $a \rightarrow b$ nor $b \rightarrow a$,
the events are said to be concurrent and causally independent.
A system of logical clocks is introduced which
assigns a natural number to each event (logical timestamp).
Logical clocks are consistent with causality [41]: if $a \rightarrow b$,
then a's timestamp is smaller than b's timestamp - the con- 



trary is not true. In [41] the concept of vector time is pre- 
sented and it is shown that vector time characterizes 
causality: two events are ordered by vector time iff they are 
causally dependent. However, neither logical clocks nor 
vector clocks can deal with causal relations that are estab- 
lished through hidden channels and also can not represent 
timed real world events. Thus they are not appropriate for 
open systems. 

In [24,47] a global time approximation is proposed, 
assuming that the maximum time difference between any 
two clocks at the same instant of time is bounded by $\delta$. The
granularity condition states that the granularity of the global
time-base g should not be smaller than $\delta$, $g > \delta$, ensuring
that global clocks do not overlap. A global and total
order of events can be determined if event timestamps are
two or more clock ticks apart, a fact known as 2g-precedence.
If this assumption does not hold in all cases, one has
to face partial ordering of events. 

Schwiderski [42] adopted the 2g-precedence model to 
deal with distributed event ordering and composite event 
detection. She proposed a distributed event detector based 
on a global event tree and introduced 2g-precedence based 
sequence and concurrency operators. However, event con- 
sumption is non-deterministic in the case of concurrent or 
unrelated events. Additionally, the violation of the granular- 
ity condition may lead to the detection of spurious events. 

The Cambridge Event Architecture (CEA) [2] presents 
the publish-register-notify paradigm. Mediators provide the
means to compose events. CEA is oriented to support mul- 
timedia, mobility, group interaction and composition of het- 
erogeneous software components [5]. The implementation 
of CEA is based on a proprietary RPC system, limiting 
interoperability. Recently, COBEA [31] was proposed, 
which extends the CORBA Event Service [37] with the
CEA publish-register-notify paradigm, supporting fault tol- 
erance, composite events, server-side filtering and access 
control. 

In EVE [19,45] an event-based middleware layer is 
proposed as platform for a workflow enactment system. 
The workflow is mapped to services and brokers. The 
behavior of brokers is defined by ECA-rules using composition
of distributed events. Specifically, EVE requires
chronicle consumption mode of events to correctly interpret 
workflow notifications. 

In CEA, COBEA and EVE, the detection of global
composite events is based on Schwiderski's approach. 

[49] presents a formal refinement of Schwiderski's 
approach and extends the Snoop event algebra to support 
event composition in distributed environments. 

The 2g-precedence based approaches cited above do 
not scale to open systems and still are ambiguous with 
respect to event consumption. 

III. Timestamping and Global Time 

We will give a short overview of the concept of global 
time and distinguish between internal and external clock 
synchronization algorithms. We then present how we lever- 
age upon a time service like NTP for provision of a global 
reference time and introduce the concept of accuracy intervals.
We define abstract interfaces for local as well as global
clock readings used for timestamping events.

If we are merely interested in relative ordering of 
events detected at the same node, a monotonically increas- 
ing counter, e.g. the local clock reading, might be sufficient. 
In the real world, we must differentiate between the occurrence
of an event and the time it takes until detection. We
have to distinguish the case where it can be assured - at the
application level - that occurrence and detection of distinct
events never overlap, such that timestamps at detection time
always reflect the order of occurrence. The more realistic
scenario, however, is that timestamping of local events does
not yield a total order because there is uncertainty about
occurrence time and detection time of events. We will
therefore define a - partial - local order that recognizes this
fact and a - partial - global order that additionally respects
the inaccuracy which is inherent in the artificial notion of
reference time. 

A. Clock Synchronization 

The instant of time at which an event occurs in the 
physical world will be called the physical time of the event. 
Reference time RT - as provided by UTC or GPS time - is a 
granular representation of dense physical time. Note that 
reference time is a conceptual artifact and inaccurate by 
nature. In fact, GPS time servers carry an error encompassing
relativistic effects as well as more significant inaccuracies
due to synchronization and clock reading errors.

In order to provide a global timebase in distributed 
systems, a common solution is to create a virtual clock at 
each node using a local hardware clock. The clock synchro- 
nization problem consists of reaching some degree of 
mutual consistency between virtual clocks and compensat- 
ing for hardware clock skew and frequency drift. Note, that 
perfect synchrony cannot be achieved by the very nature of 
our universe. 

A virtual clock is represented by a function
$C(t): RT \rightarrow CT$, $CT \subset RT$, that maps reference time to
clock time CT. A hardware clock typically consists of an
oscillator and a counting register that is incremented by the
ticks of the oscillator. The hardware clock has a certain 
granularity G by which the counter can be incremented. For 
a local hardware clock to be correct, we require a bounded 
drift rate: 

Linear Envelope:
$\forall s,t \in RT,\ s \le t:$

$(1-\rho)(t-s) - G \le C(t) - C(s) \le (1+\rho)(t-s) + G$

For most modern hardware clocks the constant $\rho$ is in
the order of $10^{-4}$ to $10^{-6}$, i.e. the clock may drift by 0.06
milliseconds or more in one minute, which compares to 6000
instructions on a 100 MIPS machine.
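
The figures above can be checked with a few lines of arithmetic. The following is a minimal sketch; the function names are illustrative and not part of the original text, and it simply multiplies the drift rate by the elapsed interval.

```python
def max_drift_seconds(rho: float, interval_s: float) -> float:
    """Worst-case deviation (in seconds) accumulated over interval_s
    by a clock whose drift rate is bounded by rho."""
    return rho * interval_s


def instructions_during(duration_s: float, mips: float) -> float:
    """Number of instructions a machine rated at `mips` million
    instructions per second executes in duration_s seconds."""
    return duration_s * mips * 1e6


# Best case quoted in the text: rho = 1e-6, observed over one minute.
drift = max_drift_seconds(1e-6, 60.0)
instr = instructions_during(drift, 100.0)
print(f"drift: {drift * 1e3:.2f} ms, about {instr:.0f} instructions")
```

With $\rho = 10^{-4}$ the same computation yields a drift that is two orders of magnitude larger.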

Internal clock synchronization consists of keeping vir- 
tual clocks within some maximum deviation from each 
other, i.e. for all correct clocks $C_i$, $C_j$ it is guaranteed:

Precision: $\exists \delta : |C_i(t) - C_j(t)| \le \delta,\ \forall t \in RT$



External clock synchronization aims at maintaining 
virtual clocks within some maximum deviation from a time 
reference external to the system, i.e. for each correct clock
$C_i$ it is guaranteed:

Accuracy: $\exists \alpha : |C_i(t) - t| \le \alpha,\ \forall t \in RT$

Internal clock synchronization algorithms [43,26,30]
guarantee precision in case of known bounds on transmission
delays of the network. Otherwise, internal clock
synchronization is best effort [9,46] and precision $\delta$ cannot be
determined a priori for all t. As accuracy $\alpha$ always implies
precision $2\alpha$, externally synchronized clocks are also
internally synchronized. Conversely, internally synchronized
clocks do not necessarily maintain accuracy with
respect to external reference time. If accuracy is a
requirement, internal clock synchronization algorithms can be
integrated with external clock synchronization as in recent
hybrid clock synchronization algorithms [15,40,46].

Timestamping based on internal clock synchronization
and the application of the 2g-precedence model [42,47] for
ordering and composing events does not scale to loosely
coupled distributed systems like the Internet. As transmission
delays vary significantly and are in general not known
a priori for all nodes of the network, it is not feasible to
determine a precision $\delta$ that holds for all t. For the same
reason such an approach is not suitable for mobile
environments [44] with long phases of disconnection. In fact, the
above approaches merely present viable solutions for systems
interconnected by real-time networks or selected
broadcast based LANs with restricted load patterns, where
at design time it is possible to determine and guarantee a
bound on $\delta$ for all instants t and all virtual clocks of the
system [47].

B. Time Service 

The Network Time Protocol defines an architecture for 
a time service and a protocol to distribute accurate time
information in a large, unmanaged global-internet environ- 
ment and is established as an Internet Standard protocol 
[33]. The participating nodes form a logical synchroniza- 
tion subnet whose levels are called strata. Primary servers 
at stratum 1 are directly connected to a time source such as 
a radio clock or a GPS receiver and provide accurate UTC 
reference time with an error ranging from some millisec- 
onds down to a few microseconds [21] - whereas GPS time 
itself is accurate in the order of 30 nanoseconds [28]. Sec- 
ondary servers at stratum 2 synchronize their clock with 
respect to stratum 1 peers plus other servers of stratum 2, 
servers at stratum 3 synchronize with stratum 2 peers and 
so on. The synchronization scheme consists of a peer selec- 
tion algorithm and estimation of the offset for the local 
clock with respect to reference time provided by the
selected peer. The peer selection algorithm chooses the best 
peer which is supposed to provide reliable and accurate 
time information. Calculating an estimation for the clock 
offset is based on exchanging timestamps between peers, as 
proposed by Cristian [9]. Additionally, statistical filters are
applied to a recent sample population, which significantly
reduces the error of the estimated offset. A detailed
performance study of NTP can be found in [34].
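
The timestamp exchange mentioned above can be illustrated as follows. This is a minimal sketch of the commonly used four-timestamp estimate for a single request/response round; the variable names t1..t4 and the function name are assumptions for illustration, and the statistical filtering and peer selection of NTP are not modeled.

```python
def estimate_offset_and_delay(t1: float, t2: float, t3: float, t4: float):
    """Estimate clock offset and round-trip delay from one exchange.

    t1: request sent     (client clock)
    t2: request received (server clock)
    t3: response sent    (server clock)
    t4: response received (client clock)
    """
    offset = ((t2 - t1) + (t3 - t4)) / 2.0   # estimated server - client offset
    delay = (t4 - t1) - (t3 - t2)            # round trip minus server processing
    return offset, delay


# Hypothetical exchange: the server clock is 5.0 s ahead, the one-way
# delays are 0.2 s and 0.4 s, and the server needs 0.1 s to respond.
offset, delay = estimate_offset_and_delay(100.0, 105.2, 105.3, 100.7)

# The true offset lies within offset +/- delay / 2; asymmetric path
# delays are what make the estimate inexact.
assert abs(5.0 - offset) <= delay / 2.0
```

The half-width delay / 2 is the kind of quantity that contributes to an error bound on a clock reading.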

C. Timestamping of Events

NTP provides a reliable error bound, the synchronization
distance, that accounts for inaccuracies due to clock
skew and offset estimation along the path to the primary 
reference server, plus the inaccuracy of the primary server's 
clock with respect to reference time. In [35] a new system 
call ntp_gettime() is introduced for reading the virtual
global clock that additionally returns a reliable error bound 
with respect to reference time. The CORBA TimeService 
[36] proposes an abstract interface that supports clock read- 
ings and additionally returns an error bound, the purpose of 
which is to wrap existing time service implementations 
such as NTP or DCE TimeService. In the following we will 
present our abstract view on a clock reading interface for 
which the above approaches provide a viable implementa- 
tion. Let us first introduce the notion of accuracy intervals 
as proposed in [32,40]. 

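Accuracy Interval: We define the accuracy interval with
reference point $t_{ref} \in RT$ and accuracy $[\alpha^-; \alpha^+]$, $\alpha^-, \alpha^+ \in RT$,
as the interval $[t_{ref} - \alpha^-; t_{ref} + \alpha^+]$.

For convenience we use the shorthand notations $[t_{ref} \pm \alpha]$,
$\alpha = [\alpha^-; \alpha^+]$, $lower([\alpha^-; \alpha^+]) = \alpha^-$ and $upper([\alpha^-; \alpha^+]) = \alpha^+$.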

Global Time Service: The global time service provides a
function get_time(): when called at physical time t,
get_time() returns the reading of the local virtual clock $C(t)$
together with a reliable error bound $synchdist_t$.
We require the global time service to be correct.

Correctness of Time Service: If get_time() is called at
physical time t and returns $C(t)$ with error $synchdist_t$, then:
$t \in [C(t) - synchdist_t ; C(t) + synchdist_t]$

Let $t_{occ}(e)$ be the instant of time when event e
occurred. Actually, it takes some time ldd until the event is
detected and is assigned a timestamp. We call ldd the local
detection delay and denote with $t_{det}(e)$ the detection time of
the event. In the following, we assume that an individual
upper bound ldd is known for each node of the system.
Local Detection Delay:
$\exists\, ldd \in RT : t_{occ}(e) \in [t_{det}(e) - ldd ; t_{det}(e)]$

The effect of the delay depends largely on the signalling
source. For example, the minimum delay in the detection
of a local method event is caused by a timer system
call. On a SUN SS10 with two CPUs at 55 MHz the timer
system call takes about 5 µsec, and it takes about 0.5 µsec
on a SUN Ultra II with two CPUs at 300 MHz, whereas the
granularity G of the local clock is 1 µsec in both cases.
In other words, the impact of ldd may be insignificant
compared to the inaccuracy imposed by the clock granularity on
the fast machine. However, on slow machines like the SS10
or in cases where the event is signaled by some external
device, ldd may be significantly larger than the clock
granularity and additionally increases the inaccuracy of the global
timestamp.



The local detection delay is taken into account by
timestamping event e as:

Global Timestamp:
$ts(e) = [C(t_{det}) \pm \alpha]$

$\alpha = [synchdist_{t_{det}} + ldd ; synchdist_{t_{det}}]$

The fact that the global timestamp $ts(e)$ contains $t_{occ}(e)$ can
easily be seen from the above definitions, because

$t_{occ}(e) \ge t_{det}(e) - ldd \ge C(t_{det}) - synchdist_{t_{det}} - ldd$
and $t_{occ}(e) \le t_{det}(e) \le C(t_{det}) + synchdist_{t_{det}}$.
We denote the length of the error interval $\alpha$ as the inaccuracy
of the timestamp.
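
The construction of a global timestamp can be sketched in code. This is an illustration under stated assumptions, not the implementation of the event service: the class and function names are hypothetical, the clock reading and synchdist are passed in as plain numbers rather than read from a time service, and `lower`/`upper` here denote the end points of the resulting interval.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccuracyInterval:
    """Interval [t_ref - alpha_minus ; t_ref + alpha_plus] (assumed layout)."""
    t_ref: float        # reference point, e.g. the clock reading C(t_det)
    alpha_minus: float  # accuracy below the reference point
    alpha_plus: float   # accuracy above the reference point

    @property
    def lower(self) -> float:
        return self.t_ref - self.alpha_minus

    @property
    def upper(self) -> float:
        return self.t_ref + self.alpha_plus

    @property
    def inaccuracy(self) -> float:
        """Length of the error interval."""
        return self.alpha_minus + self.alpha_plus

    def contains(self, t: float) -> bool:
        return self.lower <= t <= self.upper


def global_timestamp(clock_reading: float, synchdist: float, ldd: float) -> AccuracyInterval:
    """Widen the lower side by the local detection delay ldd, because the
    event occurred at most ldd before it was detected."""
    return AccuracyInterval(clock_reading, synchdist + ldd, synchdist)


# Hypothetical values: clock reads 1000.0 s, synchdist 20 ms, ldd 5 ms.
ts = global_timestamp(1000.0, 0.020, 0.005)
print(ts.lower, ts.upper, ts.inaccuracy)
```

Note the asymmetry: only the lower bound is extended by ldd, since detection can never precede occurrence.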

D. Ordering of Events 

We define a partial order on accuracy intervals as follows: 
Accuracy Interval Order: 

$I_j < I_k \Leftrightarrow \forall s \in I_j, \forall t \in I_k : s < t$

Accuracy interval order is merely a partial order. Obviously
there exist accuracy intervals $I_j$, $I_k$ such that neither
$I_j < I_k$ nor $I_k < I_j$ holds. We define the order of two events to be
uncertain if they cannot be ordered and introduce the
notation $I_j \perp I_k \equiv \neg(I_j < I_k) \wedge \neg(I_k < I_j)$. As we cannot decide
on the order of events in such cases, the event service
should take well defined actions, as we will discuss later on.
Depending on the application, the inaccuracy of timestamps
can be small with respect to the temporal offset between
causally dependent events. In this case, a well defined
application should never generate uncertain events. However,
if uncertain event orders occur, they should be
resolved by application semantics. It should be noted at this
point that the worst resolution policy, i.e. ignoring the
uncertainty of event order, does not perform worse than
previous approaches discussed in Section II.
With our approach we can guarantee in all cases that: 

•situations of uncertain event order are detected and the 
action taken is well defined 

•events are not erroneously ordered. 
More precisely, we can guarantee that accuracy interval 
order is consistent with physical time order, i.e. the follow- 
ing important property holds: 

Time Consistent Order: Given events $e_j$, $e_k$ with global
timestamps $ts(e_j)$, $ts(e_k)$, then
$ts(e_j) < ts(e_k) \Rightarrow t_{occ}(e_j) < t_{occ}(e_k)$

This proposition follows directly from the previous 
definitions of global timestamp and accuracy interval order, 
under the assumption that the time service is correct. 
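
For closed intervals, the accuracy interval order reduces to a comparison of end points: every point of the first interval lies before every point of the second exactly when the upper end of the first is smaller than the lower end of the second. A minimal sketch, assuming intervals are represented as (lower, upper) tuples and using illustrative function names:

```python
from typing import Tuple

Interval = Tuple[float, float]  # (lower end, upper end)


def precedes(a: Interval, b: Interval) -> bool:
    """Accuracy interval order: all points of a lie before all points of b."""
    return a[1] < b[0]


def uncertain(a: Interval, b: Interval) -> bool:
    """Neither interval precedes the other, so the order cannot be decided."""
    return not precedes(a, b) and not precedes(b, a)


e1 = (10.0, 10.4)
e2 = (10.5, 10.9)
e3 = (10.3, 10.6)

assert precedes(e1, e2)   # disjoint intervals: order is certain
assert uncertain(e1, e3)  # overlapping intervals: order is uncertain
assert uncertain(e3, e2)
```

The example also shows that the uncertain relation is not transitive: e1 and e3 are uncertain, e3 and e2 are uncertain, yet e1 precedes e2.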

If the expected values of synchdist are sufficiently
small, for example when detecting events at a stratum 1
server attached to GPS, it may be sufficient to order events
based on ordering of global timestamps, as defined above.
In many settings however, event detection runs at nodes of a
lower stratum and reading the clock results in large synchdist
values (10-50 msec and more) with respect to the granularity
of the local clock. Therefore we additionally provide
a mechanism for the relative ordering of events - originating
from the same node - based on local clock readings.

We assume that the local clock is monotonically
increasing and that clock discipline by NTP uses continuous
amortization. Let $e_j$, $e_k$ be events originating at the same
node, then we assign the local clock readings as local
timestamps:

Local Timestamp: If $e_j$ is detected at node N with local
detection delay ldd we define: $lts(e_j) = C(t_{det}(e_j))$.

We are interested in a time consistent order for local
timestamps. We know from the definition of local detection
delay that $t_{det}(e_j) < t_{det}(e_k) - ldd \Rightarrow t_{occ}(e_j) < t_{occ}(e_k)$. In
other words we have to find a lower bound for the distance
$t_{det}(e_k) - t_{det}(e_j)$, which can only be approximated by local
clock readings. Let us assume that there are no
resynchronizations between the two clock readings, then we know from
the linear clock drift that $C(t) - C(s) \le (1+\rho)(t-s) + G$.
Additionally we have to consider rate adjustments by the
clock discipline. For simplicity, we assume that there is a
known upper bound u for a positive rate adjustment
between two resynchronization points. Then we obtain:

$\ldots \Rightarrow ldd < \frac{\ldots}{(1+\rho)} \le t_{det}(e_k) - t_{det}(e_j)$

We now can specify the condition to order local times- 
tamps while considering the local detection delay: 

Local Timestamp Order: Let $lts(e_j)$, $lts(e_k)$ be local
timestamps of events detected at the same node.

$lts(e_j) < lts(e_k) \Leftrightarrow ldd < \frac{\ldots}{(1+\rho)}$

We refer to Schmid and Schossmaier [40] for a 
detailed discussion on how to estimate duration measure- 
ments using local clock readings, where they also discuss 
various models of local clocks and clock discipline mecha- 
nisms. 

IV. Notification Service 

In this section we describe the overall architecture of 
our event notification service and look into the implementa- 
tion details of event composition using accuracy interval 
based timestamping. Fig. 1 depicts the main components of
the event notification service. 

The architecture is similar to that of a push-style 
CORBA Notification Service [38]. Producer and consumer 
of events interact with the event channel through proxy 
interfaces: ECPI (producer) and ECCI (consumer). The 
channel itself is a conceptual artifact realized on top of mul- 
ticast messaging middleware that provides a subject-based 
addressing scheme [39]. Producers of events register metadata
for event type descriptions with the EventTypeRepository.
Consumers as well as other producers may query the
repository to find out about existing event types. If a sub- 



scriber registers interest for some type of event an appropri- 
ate ECCI proxy will be returned. This proxy is created by 
an administrative factory object and relays primitive event 
notifications received by the multicast messaging layer to 
the consumer. A producer publishes events through the call 
of ECPI::signalEvent(Event e) which also adds a local and
global timestamp and the producer name to the event 
parameters. A consumer may connect directly to the ECCI 
proxy to be notified of primitive event occurrences. Com- 
posite events are detected by specialized ECCI proxies: In 
the first stage primitive events are captured by InputNodes 
(I), encapsulating the appropriate ECCI, and then passed on 
to the CompositionNode (C) where the operator logic is 
implemented and consumption takes place. Finally, if a 
composite event is detected, it is signaled to the consumer. 
As we will show later, the CompositionNode may raise 
exceptions to inform the application of ambiguities in the 
case when candidate events cannot be ordered. 




Fig. 1. Notification Service Architecture. 



Events are reliably delivered to subscribers by the 
underlying messaging middleware and it is also guaranteed 
that events are sent by a producer in the detection order and 
that this order is preserved by the channel. 

A publish/subscribe event service per definition must 
support many-to-many communication. As a consequence 
the semantic of group membership impacts the Composi- 
tionNode subscribers, because we need to know which pro- 
ducers might have sent events that must be considered for 
composition. We provide two different group membership 
semantics: atomic membership and weak membership. 
When using atomic membership, a producer registers with 
the DirectoryService and must not start sending events
before all consumers, which are subscribed to the respective 
type of events, have been notified of the new group mem- 
ber. We leverage on the event service itself to reliably 
broadcast dedicated control events, such as a group mem- 
bership change event. When subscribing for some type of 
event a consumer may also request a list of currently active 
publishers. In the case of weak membership we delegate to 
the dynamic discovery protocol provided by the multicast 
messaging middleware. In that case a publisher can register 
without blocking at the DirectoryService. It is then possible 
that some events of the joined publisher arrive late and
invalidate former event compositions. Atomic membership 
prohibits such errors. 

As will be discussed in the next section, we introduce 
a windowing scheme combined with heartbeat events to 
cope with node failures of consumers and network failures 
like poor response times or partitioning of the network.

V. Composition and Consumption

To illustrate the impact of timestamp inaccuracy and 
varying transmission delays on event composition and con- 
sumption we will look at the simple composite event 
expression A&B, which depicts the situation that an event
of type A and an event of type B occurred. Although the 
logic of the operator does not seem to impose any ordering 
constraints, consumption of events must be considered. 
Assume there is one producer $P_A$ for type A events and
there are two producers $P1_B$, $P2_B$ for type B events which
signal to an A&B CompositionNode, as shown below:




Fig. 2. Scenario. 

There can be multiple A events and multiple B events,
even from different nodes, that are candidates to make up
the composite event. In chronicle consumption mode we
want to combine the oldest As and Bs. In recent consumption
mode we are looking for the latest events, i.e. lately
occurred events will rule out older ones. In the following,
we will assume that the CompositionNode contains a
partially ordered list for each operand. Let POList<A> be a
data structure that holds type A events and POList<B> the
one to hold type B events. The method POList<>.oldest()
returns the set of oldest events which are those events that
are not preceded by any other in the POList<>:

$\forall e \in POList\langle E\rangle.oldest() :$
$\neg(\exists e' \in POList\langle E\rangle.oldest() : ts(e') < ts(e))$

Note that oldest() may benefit from the fact that there
is only one producer for type A events and there is no need 
to relate to reference time, as it would be when implement- 
ing the sequence operator. The optimization then would be 
to use the local timestamp order instead of the global times- 
tamp-order. 
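
The selection of the oldest events from a partially ordered operand list can be sketched as follows. This is an illustration only: the `Event` type and the function names are assumptions, global timestamps are reduced to their lower and upper interval ends, and the set is computed by a simple quadratic scan rather than by an incrementally maintained data structure.

```python
from typing import List, NamedTuple


class Event(NamedTuple):
    name: str
    lower: float  # lower end of the global timestamp interval
    upper: float  # upper end of the global timestamp interval


def precedes(a: Event, b: Event) -> bool:
    """Accuracy interval order on the events' global timestamps."""
    return a.upper < b.lower


def oldest(po_list: List[Event]) -> List[Event]:
    """Events that are not preceded by any other event in the list."""
    return [e for e in po_list if not any(precedes(o, e) for o in po_list)]


# a0 and a1 overlap (uncertain order); a2 is clearly later than a0.
po_list_a = [Event("a0", 0.0, 2.0), Event("a1", 1.0, 3.0), Event("a2", 4.0, 5.0)]
print([e.name for e in oldest(po_list_a)])
```

Because the order is partial, oldest() can return more than one event; here both a0 and a1 qualify, since neither precedes the other.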

A. Window Mechanism 

We mentioned in the beginning that we have to consider
the impact of individual transmission delays. The time
diagram shown in Fig. 3 illustrates the problems that may
arise. With the arrival of $b^1_0$ at time $t_1$ we detect a tentative
composite $a_0 \& b^1_0$ event. However, we must consider the
possibility that there is another A event on its way, which
occurred at approximately the same time as $a_0$, i.e. $a_0 \perp a_1$.
When $a_1$ arrives at $t_2$ we now can be sure that $a_0$ is the oldest
A event and must be considered for composition. In the



case of B events we have to additionally consider the fact
that there are two producers, i.e. when receiving $b^1_0$ there
could be events both at $P1_B$ and $P2_B$ that have not yet been
delivered but would be elements of POList<B>.oldest(). In
general, we require POList<B>.oldest() to be stable before
constructing a composite event. We are using a window
mechanism with so called sync-points to separate the history
of events as seen by the CompositionNode - reflected
in the operand POList<> data structures - into the stable
past and the unstable past and present that still are subject
to change.



Fig. 3. Time diagram (global timestamps). 

We define the local sync-point $lts_{sync}(P_A)$ with respect
to a producer $P_A$ to denote the fact that there are no more
events a detected at $P_A$ that have not been signaled to the
CompositionNode and $lower(lts(a)) < lts_{sync}(P_A)$. The local
sync-point moves on with each event detection and is determined
by approximating a local clock value that is at least
ldd below the local timestamp of the latest event. In a similar
way we define the global sync-point $ts_{sync}(P_A)$ of a producer
$P_A$ such that there are no more events a at $P_A$ that
have not been signaled to the CompositionNode and
$lower(ts(a)) < ts_{sync}(P_A)$. Whereas the local sync-point
refers to local clock time, the global sync-point relates to
reference time. Obviously, the global sync-point with
respect to a producer $P_A$ is equivalent to the lower end of
the global timestamp of the latest detected event. In fact,
with each event received by the consumer the respective
sync-point windows move along¹. For example in Fig. 3
the global sync-point for $P1_B$ is $t_1 = lower(b^1_0)$ when $b^1_0$
is received and moves to $lower(b^1_1)$. We call
POList<B>.oldest() stable if there are no more pending
events b such that b would also belong to
POList<B>.oldest(). If all global sync-points are at the
right of the oldest timestamp in POList<B>.oldest() then
there can be no pending event that intersects with all
timestamps in POList<B>.oldest(). Without proof we present the
formal predicate for stability.

Stability: Given POList<E> and the known set of producers
for E events, PR(E):

$$\mathit{is\_stable}(\text{POList<E>.oldest()}) \iff \min_{e \,\in\, \text{POList<E>.oldest()}} \mathit{upper}(ts(e)) \;<\; \min_{P \,\in\, PR(E)} ts_{sync}(P)$$

By definition we consider the empty set not to be stable. 
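
The stability test can be illustrated with a small Python sketch. It is not the prototype's code: the names `Event`, `precedes`, `oldest` and the `sync_points` dictionary are hypothetical, and the sketch assumes that one event precedes another exactly when its accuracy interval ends before the other's begins, and that oldest() returns the events not preceded by any other event in the list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    name: str
    producer: str
    lower: float  # lower bound of the accuracy interval (reference time)
    upper: float  # upper bound of the accuracy interval


def precedes(a, b):
    """Assumed interval order: a is certainly before b."""
    return a.upper < b.lower


def oldest(po_list):
    """Events not preceded by any other event in the list (may be several)."""
    return {e for e in po_list if not any(precedes(o, e) for o in po_list)}


def is_stable(candidates, sync_points):
    """sync_points maps every known producer to its global sync-point."""
    if not candidates or not sync_points:
        return False  # the empty set is not stable by definition
    return min(e.upper for e in candidates) < min(sync_points.values())


b0 = Event("b0", "P1_B", 10.0, 12.0)

# P2_B has not advanced yet, so an older B event may still be pending.
print(is_stable(oldest([b0]), {"P1_B": b0.lower, "P2_B": 5.0}))

# Both sync-points have passed upper(b0): no pending event can join oldest().
print(is_stable(oldest([b0]), {"P1_B": 14.0, "P2_B": 13.0}))
```

Taking the minimum over all producers' sync-points is what makes a slow or silent producer hold back composition.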

1. Special attention is needed when the synchronization error
significantly increases.
B. Composition

Now that we can determine whether the candidate sets are
stable, we can present the algorithms for conjunction using
the chronicle policy. The activity diagram below shows the
execution flow when processing incoming events. First the
sync-points are updated with respect to the sender of the
event.



[Activity diagram: swimlanes InputNode and CompositionNode; first action: update sync-points]



Fig. 4. Activity diagram. 

Then we evaluate the operand lists and check if there
are stable events that can be composed. At the end we clean
up the operand lists. Below we sketch the algorithms
implemented in the CompositionNode:
SignalEvent(Event e): {
    switch typeof(e)
        case heartbeat: break;
        case A: POList<A>.add(e);
                update_sync_points(e);
                while (evaluate());
                cleanup();
                break;
        case B: // analogous to above
}

evaluate: returns boolean {
    // AND-chronicle
    Set<A> oldest_a; Set<B> oldest_b;
    if (not_empty(POList<A>) and not_empty(POList<B>))
        oldest_a = POList<A>.oldest();
        if (is_stable(oldest_a))
            if (sizeof(oldest_a) > 1)
                // exception (multiple a)
            oldest_b = POList<B>.oldest();
            if (is_stable(oldest_b))
                if (sizeof(oldest_b) > 1)
                    // exception (multiple b)
                compose(oldest_a, oldest_b);
                return (TRUE); // A & B
            else
                // expect sync-point to increase
                return (FALSE);
        else
            // expect sync-point to increase
            return (FALSE);
    return (FALSE);
}
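
The conjunction algorithm can be made concrete with a runnable Python sketch. It is a simplified illustration under stated assumptions, not the prototype: the class and method names are hypothetical, events are consumed inside `_evaluate` instead of in a separate cleanup step, and heartbeats are assumed to advance the sender's sync-point and to trigger evaluation.

```python
from collections import namedtuple

# type is "A", "B" or "heartbeat"; lower/upper bound the accuracy interval.
Event = namedtuple("Event", "type producer lower upper")


class UncertainOrder(Exception):
    """Raised when a stable candidate set holds more than one event."""


class CompositionNode:
    def __init__(self, producers):
        self.producers = producers  # e.g. {"P_A": "A", "P1_B": "B"}
        self.po = {"A": [], "B": []}  # operand lists
        self.sync = {p: float("-inf") for p in producers}
        self.composites = []

    def _oldest(self, etype):
        evs = self.po[etype]
        return [e for e in evs if not any(o.upper < e.lower for o in evs)]

    def _is_stable(self, etype, cands):
        points = [s for p, s in self.sync.items() if self.producers[p] == etype]
        return bool(cands) and bool(points) and \
            min(e.upper for e in cands) < min(points)

    def signal_event(self, e):
        if e.type != "heartbeat":
            self.po[e.type].append(e)
        # Assumption: the sync-point never moves backwards.
        self.sync[e.producer] = max(self.sync[e.producer], e.lower)
        while self._evaluate():
            pass

    def _evaluate(self):
        oldest_a, oldest_b = self._oldest("A"), self._oldest("B")
        if not (self._is_stable("A", oldest_a) and self._is_stable("B", oldest_b)):
            return False  # expect sync-points to increase
        if len(oldest_a) > 1 or len(oldest_b) > 1:
            raise UncertainOrder((oldest_a, oldest_b))
        a, b = oldest_a[0], oldest_b[0]
        self.composites.append((a, b))  # A & B
        self.po["A"].remove(a)  # chronicle: each operand is used once
        self.po["B"].remove(b)
        return True


node = CompositionNode({"P_A": "A", "P1_B": "B", "P2_B": "B"})
node.signal_event(Event("A", "P_A", 1.0, 2.0))
node.signal_event(Event("B", "P1_B", 3.0, 4.0))
print(len(node.composites))  # still waiting for the sync-points to advance
for p in ("P_A", "P2_B", "P1_B"):
    node.signal_event(Event("heartbeat", p, 5.0, 6.0))
print(len(node.composites))
```

In this run the composite is only produced once all three producers have reported a sync-point beyond the upper bounds of the candidate events.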

C. Heartbeat 

In the case that oldest_a or oldest_b is not stable yet,
we must wait for the global sync-points to increase. This
happens either when subsequent A or B events arrive, which
again trigger the evaluation algorithm, or when heartbeat
events are signaled. We require producers to signal events
with a minimum frequency. If the event stream is less
frequent or no more events occur at some producer node, the
producer generates an artificial heartbeat event for the
sake of advancing the sync-point window. When a producer
crashes or the network is partitioned for long periods, the
CompositionNode could be blocked, possibly indefinitely.
This problem is dealt with by using timeouts in the
InputNode, which in turn raise an exception at the consumer.
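
A minimal Python sketch of the two mechanisms described here, heartbeat generation at the producer and timeout detection at the InputNode, might look as follows. All names (`Producer`, `tick`, `max_silence`, `ProducerTimeout`, `check`) are hypothetical, and time is passed in explicitly so that the example does not depend on a real clock.

```python
class Producer:
    def __init__(self, name, max_silence):
        self.name = name
        self.max_silence = max_silence  # longest allowed gap between signals
        self.last_signal = None

    def signal(self, now, payload):
        self.last_signal = now
        return (self.name, payload, now)

    def tick(self, now):
        """Return a heartbeat if the producer was silent too long, else None."""
        if self.last_signal is None or now - self.last_signal >= self.max_silence:
            return self.signal(now, "heartbeat")
        return None


class ProducerTimeout(Exception):
    """Raised towards the consumer when a producer stays silent."""


class InputNode:
    def __init__(self, timeout):
        self.timeout = timeout
        self.last_seen = {}  # producer name -> arrival time of last message

    def receive(self, message, now):
        self.last_seen[message[0]] = now

    def check(self, now):
        for producer, seen in self.last_seen.items():
            if now - seen > self.timeout:
                raise ProducerTimeout(producer)


p = Producer("P_A", max_silence=5.0)
inp = InputNode(timeout=12.0)
inp.receive(p.signal(0.0, "a0"), now=0.5)
print(p.tick(3.0))   # recent signal, no heartbeat needed
print(p.tick(5.0))   # silent for max_silence, heartbeat generated
inp.check(10.0)      # within the timeout, nothing happens
```

The timeout should be chosen larger than the heartbeat period plus the expected transmission delay; otherwise a healthy but quiet producer would be reported as failed.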

D. Accepting Uncertainty 

Because the accuracy interval order is only a partial
order of events, the situation may arise that we cannot
uniquely identify an oldest event. As can be seen from the
definition of the oldest() method, the result may be a set
of events with uncertain temporal order. In the above
example of Fig. 3, oldest_b contains b^1_0 and a second B
event. This situation is considered exceptional in the sense
that the event service cannot guarantee the proposed
semantics of chronicle consumption. Therefore we explicitly
raise an exception. Alternatively, we could present the
operand candidate sets oldest_a and oldest_b to the
application and let the user decide.

In the following we illustrate the effect of uncertainty
on order-dependent operators. As an example we use the
simple sequence operator A;B. We implement the evaluate
method as follows:

evaluate: returns boolean {
    // SEQUENCE-chronicle
    Set<A> oldest_a; Set<B> oldest_b;
    if (not_empty(POList<A>) and not_empty(POList<B>))
        oldest_a = POList<A>.oldest();
        if (is_stable(oldest_a))
            if (sizeof(oldest_a) > 1)
                // exception (multiple a)
            oldest_b = POList<B>.oldestFollowing(oldest_a);
            if (is_stable(oldest_b))
                if (sizeof(oldest_b) > 1)
                    // exception (multiple b)
                else
                    compose(oldest_a, oldest_b)
                    return (TRUE); // A ; B
            else
                // expect sync-point to increase
                return (FALSE);
        else
            // expect sync-point to increase
            return (FALSE);
    else
        return (FALSE);
}

The method POList<>.oldestFollowing(Set<>)
returns the set of oldest events, which are those events that
are following the oldest event in Set<> and are not preceded
by any other in the POList<>:

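$$\forall e \in \text{POList<E>.oldestFollowing(Set<F>)}:\quad f_{min} \in \text{Set<F>},\;\; \mathit{lower}(f_{min}) = \min_{f \,\in\, \text{Set<F>}} \mathit{lower}(f)$$

$$ts(f_{min}) < ts(e) \;\wedge\; \neg\,\bigl(\exists\, e' \in \text{POList<E>.oldestFollowing(Set<F>)} : ts(e') < ts(e)\bigr)$$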
Note that the above evaluate algorithm presents the
most strict implementation of the sequence operator. In fact,
there could be pairs of events a ∈ oldest_a and b ∈ oldest_b
for which a<b holds. However, the notification service may
not silently decide which events to compose. We suggest
that the user may specify a callback to implement
application-specific selection policies. On the other hand,
if we do not explicitly recognize such situations, then
there is the possibility of erroneously signaling a complex
situation that actually did not occur.
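
The following Python sketch illustrates oldestFollowing and the suggested callback for application-specific selection. It rests on assumptions: the names `oldest_following`, `select_pairs` and `policy` are hypothetical, an event is taken to follow another when its accuracy interval starts after the other's ends, and "not preceded by any other" is evaluated among the following events only.

```python
from collections import namedtuple

Event = namedtuple("Event", "name lower upper")


def precedes(x, y):
    """Assumed interval order: x is certainly before y."""
    return x.upper < y.lower


def oldest_following(po_list, f_set):
    # f_min is the operand event with the smallest lower bound.
    f_min = min(f_set, key=lambda f: f.lower)
    following = [e for e in po_list if precedes(f_min, e)]
    return [e for e in following
            if not any(precedes(o, e) for o in following)]


def select_pairs(oldest_a, oldest_b, policy=None):
    """Strict by default; a user callback may choose among certain pairs."""
    certain = [(a, b) for a in oldest_a for b in oldest_b if precedes(a, b)]
    if len(oldest_a) == 1 and len(oldest_b) == 1:
        return certain
    if policy is None:
        raise ValueError("uncertain event order")
    return policy(certain)


a0, a1 = Event("a0", 1.0, 2.0), Event("a1", 1.5, 2.5)
b0, b1, b2 = Event("b0", 3.0, 4.0), Event("b1", 3.5, 4.5), Event("b2", 10.0, 11.0)

cands = oldest_following([b0, b1, b2], [a0, a1])
print([e.name for e in cands])

# Hypothetical policy: compose only the first pair whose order is certain.
print(select_pairs([a0, a1], cands, policy=lambda pairs: pairs[:1]))
```

Without a policy the same call raises an error, which corresponds to the strict behaviour of the evaluate method above; the callback makes the choice explicit instead of leaving it to the service.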

VI. Conclusions and Future Work 

Previous work on event composition in distributed 
environments either does not consider the possibility of par- 
tial event ordering or is based on the 2g-precedence model. 
Therefore, existing approaches suffer from one or more of 
the following drawbacks: lack of applicability to large scale 
open systems, possibility of spurious events and ambiguous 
event consumption. 

In this paper we present a new approach for times- 
tamping events in a large-scale, loosely coupled distributed 
system. We use accuracy intervals with reliable error 
bounds for timestamping of events reflecting the inherent 
inaccuracy in time measurements. We leverage existing
time service implementations, like the Network Time
Protocol, that provide reference time injected by GPS time
servers and additionally return reliable error bounds.

We propose a window mechanism to deal with varying
transmission delays when composing events from different
event sources. Most importantly, when detecting composite
events we explicitly consider the fact that events can only
be partially ordered. We introduce an accuracy interval
order that guarantees the property of time-consistent order:
events are not erroneously ordered, and situations of
uncertain event order are always detected and signaled to
the application. Thereby, event consumption modes like
recent and chronicle can be unambiguously defined. In our
ongoing research we examine different strategies to handle
uncertainty of event order. Possible approaches could be to
provide policies as service configuration options or to
introduce up-calls to the application level to let the user
decide and make event composition programmable.

As many applications like CSCW need more powerful
temporal relations between composite events [48], we
suggest thinking of composite events as having a start and
end point, thus associating an interval with the composite
event instead of using the timestamp of the terminating
event. Then we can provide composition operators that allow
for interval relations [1].

Applications with demands for high accuracy time 
stamping and timer signal handling, like real-time systems, 
are supposed to make use of special low-cost hardware 
equipment that directly integrates GPS time signals and 
may achieve down to 1 µsec accuracy [21] and guarantees
precision of down to 2 µsec. The foundations of the
proposed interval based approach are in general applicable to
such a high accuracy and high precision time environment. 
Our approach also fits well into mobile environments, pro- 
vided that the mobile devices are equipped with GPS 
receivers. 



We have implemented a prototype on top of a CORBA 
platform with multicast capabilities to experiment with 
accuracy interval based event composition. Currently we
are incorporating event composition based on interval rela- 
tions and are making extensions for up-call support. 